Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

141882981-Add-question-type-to-content-loader-and-summary-display #36

Merged

Conversation

benvand
Copy link
Contributor

@benvand benvand commented Apr 4, 2017

Create handlers for processing and displaying date questions.

  • Create a new Date class for handling date questions.
  • Create a new DateSummary class for handling viewing dates in summary tables etc.
  • Register Date handler in QUESTION_TYPES to make it auto used by all date type questions.
  • Tests.

https://trello.com/c/G3J4fbtw/30-add-question-type-to-content-loader-and-summary-display

benvand and others added 10 commits April 3, 2017 15:11
Add new interfaces for `date` type questions (3 part d, m, y inputs).

* `dmcontent/content_loader.py`
  * Extrapolate out assurance unformatting into new method
  * Add method to unformat dates
  * Add `_is_date` method on content to find out if a question is a date

* `dmcontent/questions.py`
  * Add property `is_date` to question
  * Add `Question` class `Date` as interface for date questions
  * Add `DateSummary` `QuestionSummary` class to present date in given
format
  * Make `date` type questions available in `QUESTION_TYPES`
  * Handle old date strings in date summary
Copy link
Contributor

@idavidmcdonald idavidmcdonald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like unformat_data for a date question should be on the DateQuestion itself. Similar to how we do for dynamic lists (https://github.com/alphagov/digitalmarketplace-content-loader/blob/master/dmcontent/questions.py#L369) and all question types (https://github.com/alphagov/digitalmarketplace-content-loader/blob/master/dmcontent/questions.py#L114).

However the reason this works (from a quick glance around one of the frontend apps) is because we are calling unformat_data directly on the question in our views (https://github.com/alphagov/digitalmarketplace-supplier-frontend/blob/1bc8418226e15ba6a92f4cff7c25c23dbcf8ddf6/app/main/views/briefs.py#L202) rather than using the content section unformat_data.

Maybe we should be trying to mimic the approach of get_data (https://github.com/alphagov/digitalmarketplace-content-loader/blob/master/dmcontent/content_loader.py#L64) where when getting data for a section we run through every section and then every question letting each question then be responsible for getting it's own data. Doing it that way would feel more object orientated than the current solution. Ideally we wouldn't add a new 'if' statement every time unformatting data is different for a different type of question.

Note, the reason assurance appears to be an exception is because we do not have an AssuranceQuestion, but maybe this could be restructured similarly to be on on question types that can have assurance rather than a content section.

What do you think (especially if I haven't spotted something or my suggestion isn't possible/sensible)?

@property
def value(self):
try:
return datetime.strptime(self._value, '%Y-%m-%d').strftime('%A %-d %B %Y')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are formatting it into the common digital marketplace format we would ideally use the dmutils DATE_FORMAT instead of duplicating

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a reason we we've been avoiding that. Has to do with having to add it as a dependency link in the setup.py as digitalmarketplace-content-loader is a module. I've added it, we can chat on Friday.

return ContentQuestion(data)

def test_date_is_formatted_into_user_friendly_format(self):
question = self.question().summary({'example': '2016-02-18'})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth having a test for where we don't include 0 padding i.e. 2?

Either we are saving the month input directly in which case we may see data like {'example': '2017-2-19'} in our DB, and therefore we need to know our DateSummary is able to handle that or we only save data that includes 0 padding.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, the test below now addresses this.

"""Retreive the fields from the POST data (form_data).

The d, m, y should be in the post as 'questionName-day', questionName-month ...
Extract them and format as 'YYYY-MM-DD'.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess technically we can't gurantee it will be 'YYYY-MM-DD' as the user input could be anything such as 'YYYY-M-D'. This is a really anal comment though...

def unformat_data(self, data):

"""Unpack assurance information to be used in a form
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do decide to stick with this solution then documentation for this function needs updating. Also is it worth adding a test to cover unformat_date and maybe calling unformat_data when passing a date field in?

try:
return datetime.strptime(self._value, '%Y-%m-%d').strftime('%A %-d %B %Y')
except ValueError:
return self._value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be nice to include a very short comment explaining that we are falling back to original date format

@benvand benvand force-pushed the 141882981-Add-question-type-to-content-loader-and-summary-display branch 2 times, most recently from 8f78029 to 0463711 Compare April 5, 2017 22:16
@benvand benvand force-pushed the 141882981-Add-question-type-to-content-loader-and-summary-display branch from 0463711 to f687cd5 Compare April 7, 2017 10:20
@@ -182,6 +185,10 @@ def inject_brief_questions_into_boolean_list_question(self, brief):
def has_assurance(self):
return True if self.get('assuranceApproach') else False

@property
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this?


FIELDS = ('year', 'month', 'day')

@property
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here, is this now redundant?

parts = []
for key in self.FIELDS:
identifier = '-'.join([self.id, key])
value = form_data.get(identifier, '').replace('-', '').strip()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have a test for the stripping - behaviour

'example-day': '19',
'example-month': '03',
'example-year': '',
}) == {'example': None}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect this to be None rather than -03-19?


return {self.id: '-'.join(parts) if any(parts) else None}

def unformat_data(self, data):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any tests for this method?

def unformat_data(self, data):

"""Unpack assurance information to be used in a form
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doc block top line doesn't feel correct anymore.

Copy link
Contributor

@idavidmcdonald idavidmcdonald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good.

Few comments regarding some potentially redundant things, old comments and missing test coverage. It will also need a version bump.

What is your view on using the dmutils dependancy? You seemed to imply there are some downsides to doing this?

benvand added 4 commits April 10, 2017 15:59
digitalmarketplace-content-loader/tests/test_content_loader.py::TestReadYaml::test_loading_existant_file
The above was failing on master, there's been a refactor recently but
not sure what caused it tbh.
Fixed mock to point at instance open that is used by the function being
tested.
Copy link
Contributor

@idavidmcdonald idavidmcdonald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking very good Ben.

Think I left just 3 minors now comments. One potential typo and one suggested comment improvement.

The only thing left is regarding the interface for unformat_data which I preferred the old way and am not quite sure if there is a reason for us to do it otherwise. Maybe you could shed some light?

def unformat_data(self, data):
"""Unpack assurance information to be used in a form
"""Method to process form data, special assurance case or individual question level unformat.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might suggest something including something along the lines of into data to be used in a form would be good to have in here as at the moment it reads like you are processing form data rather than processing some data that you can then pass to a form.

@@ -366,7 +369,7 @@ def get_data(self, form_data):

return {self.id: questions_data}

def unformat_data(self, data):
def unformat_data(self, key, data):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to have two different interfaces for unformat_data? It felt nice that unformat_data had the same interface regardless of if you were calling it on a Section or Question. That pattern also mimics the get_data usage.

def test_loading_existant_file(self, mocked_open):
assert read_yaml('anything.yml') == {'foo': 'bar'}

@mock.patch.object(builtins, 'open', side_effect=IOError)
@ mock.patch('dmcontent.content_loader.open', side_effect=IOError)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a space in here?

@idavidmcdonald
Copy link
Contributor

I'm slightly confused about what we want unformat_data to do.

Originally unformat_data would take any dictionary of data and return that data, unformatting any keys that were related to that question but everything else would be left untouched
(dynamic list questions are an example https://github.com/alphagov/digitalmarketplace-content-loader/blob/master/dmcontent/questions.py#L369)

Your new examples take a different approach of taking a dict of data but only returning data for that question e.g.

def unformat_data(self, data):
          return {self.id: data[self.id]}

Assuming consistency is desired, raises 2 questions for me:

  1. Should unformat_data only take input related to it's own question id or take any dict of data?
  2. If it should take any dict of data as input, should unformat_data return output only related to it's own question or return all data with it's question unformatted?

@benvand
Copy link
Contributor Author

benvand commented Apr 12, 2017

what we want unformat_data to do.

I mean unformat_data can do whatever we want. But the reasoning behind how it functions now is that it can take all the data from a form into a question and unformat only that data that is relevant to the question.

Should unformat_data only take input related to it's own question id or take any dict of data?

We can't know what fields are relevant to a question before the data goes in so we can't only pass in the relevant fields to be unformatted.

If it should take any dict of data as input, should unformat_data return output only related to it's own question or return all data with it's question unformatted?

It comes down to the whole mutable dict thing.

Originally unformat_data would take any dictionary of data and return that data

There are 2 possibilities here. Let's say the goal of our unformat_data is to capitalize the value:

def unformat_data(self, data):
    for key in data: # iterating over a dict in python only iterates on keys
        if key == self.id:
            data[key] = data[key].capitalize()
    return data

or

# Dict comp version
def unformat_data(self, data):
    return {key: (value.capitalize() if key==self.id else value) for key, value in data.items()}

# For loop version
def unformat_data(self, data):
    d = {}
    for key in data:
        if key == self.id:
            d[key] = data[key].capitalize()
        else:
            d[key] = data[key]

The first mutates the given dictionary, the second creates a new one.
Mutating a dictionary inside a method is an easy way to introduce bugs that are really hard to debug. It's a bit too hidden and under the bonnet for me. I like my methods to explicitly return the values I want to use.

The second (the way it's done now) creates a new dictionary containing all the data

return all data with it's question unformatted

The problem arises with the second one when we start looping over questions.

d={}
for question in section.questions:
    d.update(question.unformat_data(data)) # overwrites unformatted data over and over per field 

it's less valuable to define the new dictionary inside the method because you then lose the knowledge of which fields refer to a given question.

So my way:

def unformat_data(self, data):
    this_questions_data_only =  {}
    for key in data:
        if key == self.id:
            this_questions_data_only(key: data[key].capitalize())
    return this_questions_data_only

Avoids mutation, refers only to the given question, does not return data that may already have been unformatted.
And means that we can construct the dictionary outside the method like so:

d={}
for question in section.questions:
    d.update(question.unformat_data(data)) # DOES NOT overwrite unformatted data over and over per field 

with no side effects.

@idavidmcdonald
Copy link
Contributor

Cool. I think I like your argument (thank's for taking the time to write it out clearly).

It seems sensible to go with your suggestion but we should probably make unformat_data on other Questions follow the same behaviour of ditching unrelated keys (https://github.com/alphagov/digitalmarketplace-content-loader/blob/master/dmcontent/questions.py#L387). It's worth taking a look first through the content loader / front end apps to make sure there is not a dependancy of our frontend to return unrelated keys.

@allait If you've got a minute, would it be possible just for a quick check on our plans for unformat_data to make sure we aren't missing anything?

Copy link
Contributor

@allait allait left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a comment on the pricing unformat_data, but overall this seems sensible and closer to how we delegate other things between section / multiquestion and questions.

It might be worth revisiting the DynamicList implementation and updating it with the new approach (as it seems to look through the data for it's own key at the moment) and since it now should be called from the section.unformat_data we can make this a breaking change if we need to and replace the DynamicList.unformat_data calls in the apps with section.unformat_data.

@@ -464,6 +467,9 @@ def get_question(self, field_name):
if self.id == field_name or field_name in self.fields.values():
return self

def unformat_data(self, data):
return self.get_data(data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this part is interesting. These 2 lines are confusing because unformat_data and get_data perform opposite operations, so they can't have the same implementation.

The reason we need this with the proposed implementation of ContentSection.unformat_data is that it delegates based on question.get_question and Pricing returns itself whenever any of it's fields are passed in. Which means that if I understand this correctly this method will get called N times for a pricing question with N fields and each time the return value will overwrite all pricing fields in the section's unformatted data dictionary. Which is not a problem because this return value is always the same.

However, this seems to suggest that we might simplify this by passing in the original field key to the quesiton.unformat_data(self, data, key) - which then allows us to use the base class implementation and return {key: data[key]}.

Copy link
Contributor

@allait allait Apr 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although the same can be said about get_data and maybe having a similar interface between two methods makes sense.

On the other hand, they iterate over different things: get_data iterates over questions and unformat_data iterates over data keys.

@benvand
Copy link
Contributor Author

benvand commented Apr 18, 2017

@allait I believe the last 2 commits hsould address your comments. I've tried to clean up some of the code and make it clearer with comments what's going on. This has a version bump so do I need to update the changelog with the changes to unformat data on all questions?

@benvand benvand force-pushed the 141882981-Add-question-type-to-content-loader-and-summary-display branch from 92b681e to 5823b10 Compare April 18, 2017 17:59
Copy link
Contributor

@idavidmcdonald idavidmcdonald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good Ben!

One or two whitespace type comments and a question about data not existing but in principle this looks real good.

Just those tiny things to address and I know there is currently a failing Travis test to fix too.

After that it will just be the version bump. Technically we are changing the interface for DynamicList.unformat_data so it would be a breaking change... however I'm not sure if our apps actually rely on it so we might not need to change any code. So I'm not sure if it should be major or minor...

@@ -11,7 +11,7 @@
from werkzeug.datastructures import ImmutableMultiDict

from .errors import ContentNotFoundError, QuestionNotFoundError
from .questions import Question, ContentQuestion
from .questions import Question, ContentQuestion, Date
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Date potentially an unused import?

"evidence-0": "Yes, I did."
"nonDynamicListKey": 'other data'
}
"evidence-0": "Yes, I did." }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird bit of whitespace in here

root, index = question.id.split('-')
question_data = data[self.id][int(index)]
if root in question_data:
result.update({question.id: question_data.get(root)})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this method reads fairly easily.

Out of interest, is there a particular reason to use .update rather than result[question.id] as before or is this just a personal preference thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So one of the reasons I use update is to avoid the similar syntax when updating a list. Means that at a glace I can tell that result is a dict.

>>> x=[1, 2, 3, 4]
>>> y=2
>>> z=3
>>> x[y] = z
>>> x={1:2, 3:4}
>>> x[y]=z
# With the above x can be list or dict.
# With the below it can only be a dict
>>> x.update({y: z})

Another is that I like to use it like so especially in test fixtures:

def get_fake_brief(**kwargs):
    data = {
        'id':1,
        'lastChangedDate': '00:00:00.00000',
        'frameworkSlug': 'digital-outcomes-and-specialists'
        ...
    }
    data.update(kwargs)
    return data

brief1 = get_fake_brief()
brief2 = get_fake_brief(id=2)
brief3 = get_fake_brief(id=3, frameworkSlug='digital-outcomes-and-specialists-2')

Finally in multiple updates it avoids a for loop:

my_nonsense_dict = {}
my_updates = {'foo': 'bar', 'baz': 'quux'}
for key in my_updates:
    my_nonsense_dict[key] = my_updates[key]


my_nonsense_dict = {}
my_updates = {'foo': 'bar', 'baz': 'quux'}
my_nonsense_dict.update(my_updates)

So yeah it's kind of personal preference but founded on the fact that if I want to do the above things and keep my code consistent then I have to use .update everywhere.

@@ -112,7 +115,7 @@ def get_error_messages(self, errors, question_descriptor_from="label"):
return question_errors

def unformat_data(self, data):
return data
return {self.id: data[self.id]}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this use .get by default in case the key does not exist in data?

For some routes, we call unformat_data on a question directly (https://github.com/alphagov/digitalmarketplace-supplier-frontend/blob/master/app/main/views/briefs.py#L202) and the data might not yet exist.

@benvand benvand force-pushed the 141882981-Add-question-type-to-content-loader-and-summary-display branch from 5823b10 to bd77101 Compare April 19, 2017 10:12
@benvand benvand force-pushed the 141882981-Add-question-type-to-content-loader-and-summary-display branch from bd77101 to c2f8b4d Compare April 19, 2017 10:39
@benvand
Copy link
Contributor Author

benvand commented Apr 19, 2017

OK, should be good to go @idavidmcdonald
Version bumped and changelog updated

CHANGELOG.md Outdated

### What changed

New question type `Date` and non-backwards compatible change to `Question.unformat_data` which not returns only data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo "not" should be "now" I assume

@idavidmcdonald
Copy link
Contributor

There is a single typo that might be nice to fix but apart from that awesome work!
👍 👍 👍 👍 👍 👍 👍
💯 💯 💯 💯 💯 💯 💯
💯 💯 💯 💯 💯 💯 💯
👍 👍 👍 👍 👍 👍 👍

@benvand benvand force-pushed the 141882981-Add-question-type-to-content-loader-and-summary-display branch from 9a2c104 to c5dae54 Compare April 19, 2017 12:57
@benvand benvand merged commit 8ddf063 into master Apr 19, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants